12 research outputs found

    Universal Dependencies for Learner English

    We introduce the Treebank of Learner English (TLE), the first publicly available syntactic treebank for English as a Second Language (ESL). The TLE provides manually annotated POS tags and Universal Dependency (UD) trees for 5,124 sentences from the Cambridge First Certificate in English (FCE) corpus. The UD annotations are tied to a pre-existing error annotation of the FCE, whereby full syntactic analyses are provided for both the original and error-corrected versions of each sentence. Further on, we delineate ESL annotation guidelines that allow for consistent syntactic treatment of ungrammatical English. Finally, we benchmark POS tagging and dependency parsing performance on the TLE dataset and measure the effect of grammatical errors on parsing accuracy. We envision the treebank to support a wide range of linguistic and computational research on second language acquisition as well as automatic processing of ungrammatical language. This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216.

    Relatório de estágio em farmácia comunitária (Internship report in community pharmacy)

    Internship report carried out as part of the Integrated Master's in Pharmaceutical Sciences, submitted to the Faculty of Pharmacy of the University of Coimbra.

    The structure of attitude reports: representing context in grammar

    Thesis: Ph. D. in Linguistics, Massachusetts Institute of Technology, Department of Linguistics and Philosophy, September 2020. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (pages 177-183). This dissertation argues for a view of grammar that encodes certain facts about the discourse context in the narrow syntax. In particular, the recurring claim that there are clause-peripheral elements that correspond to a kind of perspectival center is supported by novel evidence that this perspectival element can be overt in certain languages. This is shown using data from attitude reports in Tigrinya (Semitic, Eritrea), which overtly realizes a perspective holder, as well as a diverse collection of other languages, including Ewe and Malayalam. In analyzing this construction, I propose that certain complementizers have a secondary use as a marker of reported speech. I unify this use of complementizers with their more common clausal subordination use by adopting the proposal in Kratzer (2006), which argues that the modal quantification component of attitude reports is in the complementizer, rather than the attitude predicate, as is commonly assumed. I also analyze two unique properties of these reportative complementizer constructions: indexical shift and logophoricity. In Tigrinya, indexical shift can be accounted for by allowing these reportative complementizers to quantify over contexts, rather than worlds, and by introducing a context-shifting operator. From a morphosyntactic perspective, I find evidence from indexical shift that person features must be assigned throughout the course of the derivation, rather than at the point of lexical insertion. I also find that these constructions create contexts for matrix clause indexical shift in Tigrinya, something that has not previously been observed. Evidence from Ewe and other languages suggests a correlation between logophoric domains and the presence of a complementizer with reportative properties. Building on this distinction, I argue from their distribution and other syntactic properties that Condition A-violating reflexives in languages like French and English are not reducible to logophors. By Carolyn Spadine.

    Universal Dependencies 1.4

    Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. The annotation scheme is based on (universal) Stanford dependencies (de Marneffe et al., 2006, 2008, 2014), Google universal part-of-speech tags (Petrov et al., 2012), and the Interset interlingua for morphosyntactic tagsets (Zeman, 2008).

    Universal Dependencies 2.4

    Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. The annotation scheme is based on (universal) Stanford dependencies (de Marneffe et al., 2006, 2008, 2014), Google universal part-of-speech tags (Petrov et al., 2012), and the Interset interlingua for morphosyntactic tagsets (Zeman, 2008).

    Universal Dependencies 2.5

    Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. The annotation scheme is based on (universal) Stanford dependencies (de Marneffe et al., 2006, 2008, 2014), Google universal part-of-speech tags (Petrov et al., 2012), and the Interset interlingua for morphosyntactic tagsets (Zeman, 2008).

    Universal Dependencies 2.6

    Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. The annotation scheme is based on (universal) Stanford dependencies (de Marneffe et al., 2006, 2008, 2014), Google universal part-of-speech tags (Petrov et al., 2012), and the Interset interlingua for morphosyntactic tagsets (Zeman, 2008).

    Universal Dependencies 2.7

    Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. The annotation scheme is based on (universal) Stanford dependencies (de Marneffe et al., 2006, 2008, 2014), Google universal part-of-speech tags (Petrov et al., 2012), and the Interset interlingua for morphosyntactic tagsets (Zeman, 2008).

    Universal Dependencies 2.10

    Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. The annotation scheme is based on (universal) Stanford dependencies (de Marneffe et al., 2006, 2008, 2014), Google universal part-of-speech tags (Petrov et al., 2012), and the Interset interlingua for morphosyntactic tagsets (Zeman, 2008).